AITopics | symbolic policy

7207ffb9888068c0ee13ae3be023cada-Paper-Conference.pdf

Neural Information Processing SystemsFeb-14-2026, 06:42:20 GMT

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.14)
Asia > China > Beijing > Beijing (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(21 more...)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)

Add feedback

Efficient Symbolic Policy Learning with Differentiable Symbolic Expression

Neural Information Processing SystemsDec-26-2025, 01:07:13 GMT

Deep reinforcement learning (DRL) has led to a wide range of advances in sequential decision-making tasks. However, the complexity of neural network policies makes it difficult to understand and deploy with limited computational resources. Currently, employing compact symbolic expressions as symbolic policies is a promising strategy to obtain simple and interpretable policies. Previous symbolic policy methods usually involve complex training processes and pre-trained neural network policies, which are inefficient and limit the application of symbolic policies. In this paper, we propose an efficient gradient-based learning method named Efficient Symbolic Policy Learning (ESPL) that learns the symbolic policy from scratch in an end-to-end way.

differentiable symbolic expression, efficient symbolic policy learning, symbolic policy, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)

Add feedback

Neurosymbolic Reinforcement Learning with Formally Verified Exploration

Neural Information Processing SystemsDec-24-2025, 00:06:19 GMT

A key challenge for provably safe deep RL is that repeatedly verifying neural networks within a learning loop is computationally infeasible. We address this challenge using two policy classes: a general, neurosymbolic class with approximate gradients and a more restricted class of symbolic policies that allows efficient verification. Our learning algorithm is a mirror descent over policies: in each iteration, it safely lifts a symbolic policy into the neurosymbolic space, performs safe gradient updates to the resulting policy, and projects the updated policy into the safe symbolic subset, all without requiring explicit verification of neural networks. Our empirical results show that REVEL enforces safe exploration in many scenarios in which Constrained Policy Optimization does not, and that it can discover policies that outperform those learned through prior approaches to verified exploration.

name change, neurosymbolic reinforcement learning, verified exploration, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.51)

Add feedback

Symbolic Distillation for Learned TCP Congestion Control

Neural Information Processing SystemsNov-14-2025, 03:27:06 GMT

However, such "black-box" policies lack interpretability and reliability, and often,

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > California > Alameda County > Oakland (0.04)

Industry:

Information Technology (0.46)
Education (0.46)
Transportation (0.35)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Add feedback

7207ffb9888068c0ee13ae3be023cada-Paper-Conference.pdf

Neural Information Processing SystemsOct-8-2025, 21:52:30 GMT

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.14)
Asia > China > Beijing > Beijing (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(21 more...)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)

Add feedback

4574ac9854d4defe3bf119d07b817084-Paper-Conference.pdf

Neural Information Processing SystemsAug-14-2025, 13:01:22 GMT

agent, algorithm, congestion control, (15 more...)

Neural Information Processing Systems

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > California > Alameda County > Oakland (0.04)

Industry:

Education (0.68)
Information Technology (0.46)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Add feedback

Interpretable Reinforcement Learning for Load Balancing using Kolmogorov-Arnold Networks

Singh, Kamal, Marouani, Sami, Sheikh, Ahmad Al, Quang, Pham Tran Anh, Habrard, Amaury

arXiv.org Artificial IntelligenceMay-21-2025

As load and delta load increase, the policy puts more flows on the Internet link. Increasing Internet delay puts the flows on MPLS. The contribution of Internet loss seems counter intuitive as it seems to put more load on Internet Link. However, even if its coefficient is near to 1.0, the overall contribution of the term is negligible as compared to load because loss in our scenario varies from 0 to around 0.15. This applies to delay too. For minimising loss, we extract the following: a 1. 9 1 .1( 2 λ 3 + 1) 2 2λ i 5 + 10 d i 3 + u i 10 (4) This policy can be interpreted as follows, and we may refer to Figure 1 as well. The ratio starts near 0.8 and increasing load, with increasing delta, puts more traffic on Internet link. Increasing Internet delay and Internet link utilisation slightly shifts the balance towards putting more traffic on MPLS link. Distillation of symbolic equations of PPO policy: In this method, we train policy using PPO, generate trajectory data and then generate the symbolic equations using auto-regressive models [22].

machine learning, reinforcement learning, traffic, (18 more...)

arXiv.org Artificial Intelligence

2505.14459

Genre: Research Report (0.40)

Industry: Energy > Power Industry (0.43)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Efficient Symbolic Policy Learning with Differentiable Symbolic Expression

Neural Information Processing SystemsJan-19-2025, 06:24:21 GMT

Deep reinforcement learning (DRL) has led to a wide range of advances in sequential decision-making tasks. However, the complexity of neural network policies makes it difficult to understand and deploy with limited computational resources. Currently, employing compact symbolic expressions as symbolic policies is a promising strategy to obtain simple and interpretable policies. Previous symbolic policy methods usually involve complex training processes and pre-trained neural network policies, which are inefficient and limit the application of symbolic policies. In this paper, we propose an efficient gradient-based learning method named Efficient Symbolic Policy Learning (ESPL) that learns the symbolic policy from scratch in an end-to-end way.

differentiable symbolic expression, efficient symbolic policy learning, symbolic policy, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.99)

Add feedback

Neurosymbolic Reinforcement Learning with Formally Verified Exploration

Neural Information Processing SystemsOct-10-2024, 02:01:27 GMT

A key challenge for provably safe deep RL is that repeatedly verifying neural networks within a learning loop is computationally infeasible. We address this challenge using two policy classes: a general, neurosymbolic class with approximate gradients and a more restricted class of symbolic policies that allows efficient verification. Our learning algorithm is a mirror descent over policies: in each iteration, it safely lifts a symbolic policy into the neurosymbolic space, performs safe gradient updates to the resulting policy, and projects the updated policy into the safe symbolic subset, all without requiring explicit verification of neural networks. Our empirical results show that REVEL enforces safe exploration in many scenarios in which Constrained Policy Optimization does not, and that it can discover policies that outperform those learned through prior approaches to verified exploration.

neurosymbolic reinforcement learning, safe exploration, verified exploration, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)

Add feedback

Discovering Dynamic Symbolic Policies with Genetic Programming

de Vries, Sigur, Keemink, Sander, van Gerven, Marcel

arXiv.org Artificial IntelligenceJun-21-2024

Artificial intelligence (AI) techniques are increasingly being applied to solve control problems. However, control systems developed in AI are often black-box methods, in that it is not clear how and why they generate their outputs. A lack of transparency can be problematic for control tasks in particular, because it complicates the identification of biases or errors, which in turn negatively influences the user's confidence in the system. To improve the interpretability and transparency in control systems, the black-box structure can be replaced with white-box symbolic policies described by mathematical expressions. Genetic programming offers a gradient-free method to optimise the structure of non-differentiable mathematical expressions. In this paper, we show that genetic programming can be used to discover symbolic control systems. This is achieved by learning a symbolic representation of a function that transforms observations into control signals. We consider both systems that implement static control policies without memory and systems that implement dynamic memory-based control policies. In case of the latter, the discovered function becomes the state equation of a differential equation, which allows for evidence integration. Our results show that symbolic policies are discovered that perform comparably with black-box policies on a variety of control tasks. Furthermore, the additional value of the memory capacity in the dynamic policies is demonstrated on experiments where static policies fall short. Overall, we demonstrate that white-box symbolic policies can be optimised with genetic programming, while offering interpretability and transparency that lacks in black-box models.

experiment, fitness, symbolic policy, (16 more...)

arXiv.org Artificial Intelligence

2406.02765

Country: